Migratory compression: coarse-grained data reordering to improve compressibility

Authors

  • Xing Lin
  • Guanlin Lu
  • Fred Douglis
  • Philip Shilane
  • Grant Wallace
Abstract

We propose Migratory Compression (MC), a coarse-grained data transformation, to improve the effectiveness of traditional compressors in modern storage systems. In MC, similar data chunks are relocated so that they sit together, which improves compression factors. After decompression, migrated chunks return to their previous locations. We evaluate the compression effectiveness and overhead of MC, explore reorganization approaches on a variety of datasets, and present a prototype implementation of MC in a commercial deduplicating file system. We also compare MC to the more established technique of delta compression, which is significantly more complex to implement within file systems. We find that MC improves compression effectiveness compared to traditional compressors by 11% to 105%, with relatively low impact on runtime performance. Frequently, adding MC to a relatively fast compressor like gzip results in compression that is more effective in both space and runtime than slower alternatives. In archival migration, MC improves gzip compression by 44–157%. Most importantly, MC can be implemented in broadly used, modern file systems.
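The relocate, compress, and restore cycle described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the fixed chunk size, the crude max-hash similarity feature, the use of zlib as the traditional compressor, and the names `chunks_of`, `feature`, `mc_compress`, and `mc_decompress`.

```python
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen for illustration


def chunks_of(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def feature(chunk, window=8):
    """Crude similarity feature: the largest CRC32 over all sliding windows.

    Chunks that share most of their content are likely to share this
    value. This is a stand-in, not the paper's similarity detection.
    """
    if len(chunk) < window:
        return zlib.crc32(chunk)
    return max(zlib.crc32(chunk[i:i + window])
               for i in range(len(chunk) - window + 1))


def mc_compress(data, size=CHUNK_SIZE):
    """Relocate similar chunks next to each other, then compress."""
    chunks = chunks_of(data, size)
    # Sorting by feature places chunks with equal features side by side.
    order = sorted(range(len(chunks)), key=lambda i: (feature(chunks[i]), i))
    migrated = b"".join(chunks[i] for i in order)
    lengths = [len(chunks[i]) for i in order]
    # order and lengths form the recipe needed to undo the migration.
    return zlib.compress(migrated), order, lengths


def mc_decompress(blob, order, lengths):
    """Decompress, then return every chunk to its original position."""
    migrated = zlib.decompress(blob)
    chunks = [b""] * len(order)
    pos = 0
    for original_index, n in zip(order, lengths):
        chunks[original_index] = migrated[pos:pos + n]
        pos += n
    return b"".join(chunks)
```

In this sketch the recipe (`order` and `lengths`) is returned alongside the compressed bytes; a real system would have to store it, and that storage cost counts against any space savings.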

Similar resources

A Compression-Boosting Transform for Two-Dimensional Data

We introduce a novel invertible transform for two-dimensional data whose objective is to reorder the matrix so that its (lossless) compression improves at later stages. The transform requires solving a computationally hard problem, for which a randomized algorithm is used. The inverse transform is fast and can be implemented in linear time in the size of the matrix. Preliminary experi...

Reordering for Better Compressibility

Compressed Sensing (CS) is a novel sampling paradigm that tries to take data-compression concepts down to the sampling layer of a sensory system. It states that discrete compressible signals are recoverable from sub-sampled data, when the data vector is acquired by a special linear transform of the original discrete signal vector. Distributed sampling problems especially in Wireless Sensor Netw...

A Tri-modal 2024 Al -B4C composites with super-high strength and ductility: Effect of coarse-grained aluminum fraction on mechanical behavior

In this study, ultrafine grained 2024 Al alloy based B4C particles reinforced composite was produced by mechanical milling and hot extrusion. Mechanical milling was used to synthesize the nanostructured Al2024 in attrition mill under argon atmosphere up to 50h. A similar process was used to produce Al2024-5%wt. B4C composite powder. To produce trimodal composites, milled powders were combined w...

Soil Compression Index Prediction Model for Fine Grained Soils

Compressibility of a soil mass is its susceptibility to decrease in volume under pressure and is indicated by soil characteristics like coefficient of compressibility, compression index and coefficient of consolidation. However, the determination of soil compressibility characteristics in the labs is a cumbersome and time consuming process, especially in the case of fine grained soils. In the p...

Modification of flow and compressibility of corn starch using quasi-emulsion solvent diffusion method

Objective(s): The aim of this study was to improve flowability and compressibility characteristics of starch to use as a suitable excipient in direct compression tabletting. Quasi-emulsion solvent diffusion was used as a crystal modification method. Materials and Methods: Corn starch was dissolved in hydrochloric acid at 80˚C and then ethanol as a non-solvent was added with lowering temperature ...

Publication date: 2014